Calculating median after multiple imputation

Ashmita Karki

Join Date: Jan 2024

Posts: 9
#1

03 Mar 2024, 01:47

Hi there,

Could anyone please guide me on how to calculate the median of an outcome variable after performing multiple imputation?

I want to calculate the median instead of the mean because the outcome variable is not normally distributed. mi estimate doesn't support the median, so I am struggling to calculate median values.

If it were the mean, I would use the following code:

mi estimate: mean EQ5D_QOLscore if I_C==0, over(B_E)

However, I can't use median with mi estimate.
Felix Bittmann

Join Date: Aug 2018

Posts: 838
#2

03 Mar 2024, 02:03

To get a p-value for the difference of the two groups you can also try:

Code:

mi estimate: qreg EQ5D_QOLscore i.B_E if I_C==0

Best wishes

Stata 18.0 MP
Ashmita Karki

Join Date: Jan 2024

Posts: 9
#3

03 Mar 2024, 05:49

Hi Felix, thank you. The code you mentioned above does give me the differences in median values between the groups. However, I wanted code that would also give me the median value of each group. Is there code for the median like the following code for the mean: mi estimate: mean EQ5D_QOLscore if I_C==0, over(B_E)?
Felix Bittmann

Join Date: Aug 2018

Posts: 838
#4

03 Mar 2024, 10:33

If you want to see the medians you can do it like that:

Code:

ssc install mimgrns, replace
cap drop samp
mi estimate, saving(test, replace) esample(samp): qreg outcomevar i.groupvar
mimrgns groupvar using test, esample(samp)
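If installing a community-contributed command is not an option, the group medians can also be written as functions of the -qreg- coefficients and pooled directly by -mi estimate-, which accepts transformed-coefficient expressions of the form (name: exp). The following is only a sketch on simulated data: the data-generating step, the seed, the number of imputations, and the labels med0 and med1 are all assumptions made for illustration, and outcomevar and groupvar are the same placeholder names as above.

```stata
* Simulated stand-in data (hypothetical; replace with your own mi data)
clear
set seed 2024
set obs 300
generate byte groupvar = mod(_n, 2)
generate double outcomevar = rchi2(3) + 2*groupvar
replace outcomevar = . if runiform() < 0.2

* Impute the outcome with predictive mean matching (5 imputations, assumed)
mi set wide
mi register imputed outcomevar
mi impute pmm outcomevar i.groupvar, add(5) knn(5) rseed(2024)

* Median regression; each group median is a function of the coefficients:
*   group 0: _b[_cons]      group 1: _b[_cons] + _b[1.groupvar]
mi estimate (med0: _b[_cons]) (med1: _b[_cons] + _b[1.groupvar]): ///
    qreg outcomevar i.groupvar
```

The pooled values for med0 and med1 appear in the "Transformations" table of the output and are stored in e(b_Q_mi).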

Best wishes

Dirk Enzmann

Join Date: Apr 2014

Posts: 600
#5

03 Mar 2024, 16:14

In #4 the package name is misspelled as mimgrns; it should be

Code:

ssc install mimrgns, replace
Ashmita Karki

Join Date: Jan 2024

Posts: 9
#6

05 Mar 2024, 05:19

Thank you Felix and Dirk. Very helpful
Sifan Cao

Join Date: May 2021

Posts: 14
#7

17 Feb 2025, 21:12

Originally posted by Felix Bittmann

If you want to see the medians you can do it like that:

Code:

ssc install mimgrns, replace
cap drop samp
mi estimate, saving(test, replace) esample(samp): qreg outcomevar i.groupvar
mimrgns groupvar using test, esample(samp)

Hi Felix and Dirk, how can I get the 25th and 75th percentiles (q25 and q75) from this procedure?
Sifan Cao

Join Date: May 2021

Posts: 14
#8

17 Feb 2025, 21:13

Also, is there any way to run 'ranksum' with 'mi estimate'?
Clyde Schechter

Join Date: Apr 2014

Posts: 30355
#9

17 Feb 2025, 22:39

Re #7: -qreg- has a -quantile()- option, which you can specify as -quantile(.25)- or -quantile(.75)-. Also, if it is actually the interquartile range you want, not the 25th and 75th percentiles themselves, you can get that with the -iqreg- command.
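To make that concrete, here is a rough sketch on simulated data. The data-generating step, the seed, the number of imputations, and the matrix names b25 and b75 are assumptions made for illustration; outcomevar and groupvar are the placeholder names from #4.

```stata
* Simulated stand-in data (hypothetical; replace with your own mi data)
clear
set seed 2025
set obs 300
generate byte groupvar = mod(_n, 2)
generate double outcomevar = rchi2(3) + 2*groupvar
replace outcomevar = . if runiform() < 0.2
mi set wide
mi register imputed outcomevar
mi impute pmm outcomevar i.groupvar, add(5) knn(5) rseed(2025)

* 25th percentile: _cons is the base group, 1.groupvar the group difference
mi estimate: qreg outcomevar i.groupvar, quantile(.25)
matrix b25 = e(b_mi)

* 75th percentile, same reading of the coefficients
mi estimate: qreg outcomevar i.groupvar, quantile(.75)
matrix b75 = e(b_mi)

* Interquartile range; -iqreg- bootstraps its standard errors
mi estimate: iqreg outcomevar i.groupvar, reps(50)
```

To get the quartile of each group rather than the difference, the same -quantile()- option can be added to the -qreg- call in the -mi estimate, saving() esample(): ...- line of #4 before running -mimrgns-.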

Re #8: Yes, I think so. But it's a lot of work. Also, generally when a Stata command doesn't support something, it's because the command cannot produce statistically valid results when used in that way. The fact that -ranksum-, or some equivalent command, is not supported suggests that it may fail to meet the requirements for Rubin's rules to be valid.*

Anyway, if you want to try to do this, you need to use the -cmdok- option in the -mi estimate- prefix so that Stata will run -mi estimate- even though the command is not supported. Even then, it will not run with -ranksum- because that is not an estimation command. So you will have to wrap -ranksum- in an -eclass- program with all of the properties required of programs for running under -mi estimate-. Those requirements can be seen by running -help program_properties##mi-.

May I first ask you, however, to think carefully about whether you have a plausible case for your missingness being either MAR or MCAR before you invest a lot of time and effort in this?

*I raise this as a precaution. The actual test statistic calculated by -ranksum- has an approximately standard normal (z) distribution. Its numerator is -r(sum_obs)- minus -r(sum_exp)-, and its denominator is the square root of the variance that -ranksum- returns in -r(Var_a)-. This is rather analogous to an OLS regression coefficient, so I think Rubin's rules would actually work correctly with it. But it's been decades since I learned the details of this and I may be overlooking or misremembering something here. There are others on the Forum who are more knowledgeable and up-to-date on multiple imputation, and I would be happy to stand corrected by one of them if I have this wrong. Anyway, if I am right, making the b matrix from -r(sum_obs)- minus -r(sum_exp)- and the V matrix from -r(Var_a)-, and posting those and the other required statistics in -e()-, will probably do the trick.
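A minimal sketch of what such a wrapper might look like follows. It is untested against real data and rests on the assumption above that Rubin's rules are appropriate for this statistic. The program name ranksum_e, the coefficient label diff, and the simulated data are all hypothetical; only -ranksum-, its returned results, and the -cmdok- option come from the discussion above.

```stata
* Hypothetical e-class wrapper: posts the rank-sum difference as e(b)
* and its adjusted variance as e(V) so that -mi estimate- can pool them
capture program drop ranksum_e
program define ranksum_e, eclass
    version 14
    syntax varname(numeric) [if] [in], by(varname)
    marksample touse
    markout `touse' `by', strok
    quietly ranksum `varlist' if `touse', by(`by')
    tempname d v b V
    scalar `d' = r(sum_obs) - r(sum_exp)   // numerator of z
    scalar `v' = r(Var_a)                  // variance adjusted for ties
    local n = r(N_1) + r(N_2)
    matrix `b' = (`d')
    matrix `V' = (`v')
    matrix colnames `b' = diff
    matrix rownames `V' = diff
    matrix colnames `V' = diff
    ereturn post `b' `V', esample(`touse') obs(`n')
    ereturn local cmd "ranksum_e"
end

* Simulated stand-in data (hypothetical; replace with your own mi data)
clear
set seed 2026
set obs 300
generate byte groupvar = mod(_n, 2)
generate double outcomevar = rchi2(3) + 2*groupvar
replace outcomevar = . if runiform() < 0.2
mi set wide
mi register imputed outcomevar
mi impute pmm outcomevar i.groupvar, add(5) knn(5) rseed(2026)

* -cmdok- lets -mi estimate- run the unsupported wrapper
mi estimate, cmdok: ranksum_e outcomevar, by(groupvar)
```

The pooled coefficient divided by its pooled standard error would then play the role of the z statistic, with the caveats above about whether that pooling is valid.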